AITopics | imbalanced class

imbalanced class

Review for NeurIPS paper: Distribution Aligning Refinery of Pseudo-label for Imbalanced Semi-supervised Learning

Neural Information Processing Systems | Jan-27-2025, 08:57:44 GMT

This paper proposes an approach to semi-supervised learning with imbalanced classes. Combining local/global/perturbation consistency-based semi-supervised methods with fully supervised methods for imbalanced classes is non-trivial, and this paper may be the first work in this direction. The approach is quite general and can be applied on top of any pseudo-labeling-based semi-supervised method. It first estimates the true class-prior probability and then refines the pseudo-labels, pushing their class distribution toward that estimate via a constrained convex optimization. The reviewers initially had some concerns (mainly about clarity and the small number of datasets), but the authors did a particularly good job in their rebuttal, showing that the class-prior probability can be estimated rather than having to be given.
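
The refinement step can be illustrated with a rough sketch. It assumes the class prior has already been estimated and uses plain alternating rescaling (iterative proportional fitting) as a simplified stand-in for the constrained convex optimization; it is not the paper's algorithm, and the function name and inputs are hypothetical.

```python
def align_pseudo_labels(probs, class_prior, n_iters=200):
    """Rescale soft pseudo-labels so their class marginals match class_prior.

    probs: list of rows, each a list of strictly positive class probabilities.
    class_prior: assumed (already estimated) class distribution, summing to 1.
    """
    n = len(probs)
    k = len(class_prior)
    target = [n * p for p in class_prior]  # desired total mass per class
    x = [row[:] for row in probs]
    for _ in range(n_iters):
        # Column step: scale each class so its total mass matches the prior.
        col_sums = [sum(row[j] for row in x) for j in range(k)]
        x = [[row[j] * target[j] / col_sums[j] for j in range(k)] for row in x]
        # Row step: each sample's pseudo-label must remain a distribution.
        x = [[v / sum(row) for v in row] for row in x]
    return x


# Hypothetical pseudo-labels that are biased toward class 0.
biased = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4]]
aligned = align_pseudo_labels(biased, class_prior=[0.5, 0.5])
```

The rescaling keeps the relative ordering of samples within each class, so the most confident predictions stay the most confident while the overall class balance moves to the assumed prior.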

distribution aligning refinery, imbalanced semi-supervised learning, neurips paper, (6 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.63)

Locality-preserving Directions for Interpreting the Latent Space of Satellite Image GANs

Kourmouli, Georgia, Kostagiolas, Nikos, Panagakis, Yannis, Nicolaou, Mihalis A.

arXiv.org Artificial Intelligence | Sep-26-2023

We present a locality-aware method for interpreting the latent space of wavelet-based Generative Adversarial Networks (GANs) that captures well the large spatial and spectral variability characteristic of satellite imagery. By focusing on preserving locality, the proposed method is able to decompose the weight-space of pre-trained GANs and recover interpretable directions that correspond to high-level semantic concepts (such as urbanization, structure density, flora presence), which can subsequently be used for guided synthesis of satellite imagery. In contrast to commonly used approaches that focus on capturing the variability of the weight-space in a reduced-dimensionality space (i.e., based on Principal Component Analysis, PCA), we show that preserving locality leads to vectors with different angles that are more robust to artifacts and can better preserve class information. Via a set of quantitative and qualitative examples, we further show that the proposed approach can outperform both baseline geometric augmentations and global, PCA-based approaches for data synthesis in the context of data augmentation for satellite scene classification.

acc, augmentation, experiment, (14 more...)

arXiv.org Artificial Intelligence

2309.14883

Genre: Research Report > New Finding (0.46)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.34)

How to Handle Imbalanced Classes in Machine Learning

#artificialintelligence | Feb-7-2023, 07:50:39 GMT

Imbalanced classes put "accuracy" out of business. This is a surprisingly common problem in machine learning (specifically in classification), occurring in datasets with a disproportionate ratio of observations in each class. Standard accuracy no longer reliably measures performance, which makes model training much trickier. Up-sampling the minority class means oversampling the under-represented class in a binary classification problem to balance the class distribution. The idea is to randomly duplicate examples from the minority class until it is better represented in the dataset and the class distribution is more balanced.
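
Random up-sampling as described above can be sketched in a few lines of standard-library Python. The function name, the fixed seed, and the choice to duplicate until the minority class matches the majority count are assumptions made for illustration, not details taken from the article.

```python
import random
from collections import Counter


def upsample_minority(rows, labels, seed=0):
    """Duplicate random minority-class rows until both classes are the same size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    minority = min(counts, key=counts.get)
    majority_count = max(counts.values())
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    # Draw with replacement from the minority class to close the gap.
    n_extra = majority_count - counts[minority]
    extra = [rng.choice(minority_idx) for _ in range(n_extra)]
    keep = list(range(len(labels))) + extra
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Hypothetical data: 8 majority rows (label 0) and 2 minority rows (label 1).
rows, labels = upsample_minority(list(range(10)), [0] * 8 + [1] * 2)
```

Resampling is normally applied to the training split only; duplicating rows before splitting would let copies of the same example land in both training and test data.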

class distribution, imbalanced class, majority class, (9 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Machine Learning Performance Analysis to Predict Stroke Based on Imbalanced Medical Dataset

Jing, Yuru

arXiv.org Artificial Intelligence | Nov-14-2022

Cerebral stroke, the second leading cause of death worldwide, has been a primary public health concern over the last few years. Machine learning techniques make early detection of stroke warning signs possible, which can help prevent a stroke or reduce its severity. Medical datasets, however, are frequently imbalanced in their class labels, and models trained on them tend to predict minority classes poorly. In this paper, the potential risk factors for stroke are investigated. Moreover, four distinct approaches are applied to improve the classification of the minority class in the imbalanced stroke dataset, and their performance is compared: the ensemble weighted voting classifier, the Synthetic Minority Over-sampling Technique (SMOTE), Principal Component Analysis with K-Means Clustering (PCA-Kmeans), and Focal Loss with a Deep Neural Network (DNN). According to the analysis results, SMOTE and PCA-Kmeans with DNN-Focal Loss work best for a severely imbalanced dataset of limited size (e.g., the Stroke dataset), outperforming Kaggle's work by a factor of 2-4.
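
For readers unfamiliar with focal loss, here is a minimal sketch of the standard binary form, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). The defaults gamma=2 and alpha=0.25 are the commonly cited ones and are an assumption here; the abstract does not state the settings used in the paper.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for one example.

    p: predicted probability of the positive class, strictly between 0 and 1.
    y: true label, 1 (positive) or 0 (negative).
    """
    p_t = p if y == 1 else 1.0 - p              # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # (1 - p_t) ** gamma shrinks the loss of examples that are already easy.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = binary_focal_loss(0.9, 1)  # confident and correct: heavily down-weighted
hard = binary_focal_loss(0.1, 1)  # confident and wrong: keeps most of its loss
```

With gamma set to 0 the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, which is a convenient sanity check.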

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2211.07652

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
Europe > Switzerland (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.96)
Health & Medicine > Therapeutic Area > Neurology (0.68)
Health & Medicine > Consumer Health (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.48)

How to evaluate a machine learning model - part 4 - Edvancer Eduventures

#artificialintelligence | May-26-2022, 18:21:02 GMT

This blog post continues my previous articles: part 1, part 2 and part 3. Caution: the difference between training metrics and evaluation metrics. Sometimes the model training procedure uses a different metric (also known as a loss function) than the evaluation does. This can happen when we re-appropriate a model for a task other than the one it was designed for. For example, we might train a personalized recommender by minimizing the loss between its predictions and observed ratings, and then use this recommender to produce a ranked list of recommendations. This is not an optimal scenario. It makes the life of the model difficult by asking it to do a task that it was not trained to do.
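
A small sketch makes the recommender example concrete: a model can score better on the training-style metric (squared error on ratings) and still produce a worse ranking. The ratings and predictions below are invented for illustration, and pairwise ranking accuracy is just one possible ranking metric.

```python
from itertools import combinations


def mse(pred, true):
    """Mean squared error between predicted and observed ratings."""
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true)


def pairwise_ranking_accuracy(pred, true):
    """Fraction of item pairs that the predictions order the same way as the truth."""
    pairs = [(i, j) for i, j in combinations(range(len(true)), 2)
             if true[i] != true[j]]
    correct = sum((pred[i] - pred[j]) * (true[i] - true[j]) > 0 for i, j in pairs)
    return correct / len(pairs)


true_ratings = [5.0, 4.0, 1.0]
model_a = [3.9, 4.1, 1.0]  # close to the ratings, but swaps the top two items
model_b = [3.0, 2.5, 0.0]  # further from the ratings, but ordered correctly

print(mse(model_a, true_ratings), pairwise_ranking_accuracy(model_a, true_ratings))
print(mse(model_b, true_ratings), pairwise_ranking_accuracy(model_b, true_ratings))
```

Model A wins on squared error while model B wins on ranking, which is the mismatch the post warns about.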

metric, outlier, recommender, (14 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Classification with Imbalanced Data

#artificialintelligence | Nov-21-2021, 14:15:34 GMT

Building classification models on data that has largely imbalanced classes can be difficult. Using techniques such as oversampling, undersampling, resampling combinations, and custom filtering can improve accuracy. In this article, I'll walk through a few different approaches to deal with data imbalance in classification tasks. To demonstrate various class imbalance techniques, a fictitious dataset of credit card defaults will be used. In our scenario, we are trying to build an explainable classifier that takes two inputs (age and card balance) and predicts whether someone will miss an upcoming payment.
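
Of the techniques listed, random undersampling is the simplest to sketch: majority-class rows are dropped at random until every class has as many rows as the smallest one. The (age, card balance) records below are made up to mirror the article's scenario, and the function name is hypothetical.

```python
import random


def undersample_majority(records, labels, seed=0):
    """Randomly drop rows so that every class keeps as many rows as the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    n_min = min(len(idx) for idx in by_class.values())
    keep = []
    for idx in by_class.values():
        keep.extend(rng.sample(idx, n_min))  # sample without replacement
    keep.sort()                              # preserve the original row order
    return [records[i] for i in keep], [labels[i] for i in keep]


# Hypothetical (age, card balance) records; label 1 means a missed payment.
records = [(25, 1200), (41, 300), (37, 5400), (52, 80),
           (29, 2500), (60, 150), (33, 7800), (45, 9100)]
labels = [0, 0, 0, 0, 0, 0, 1, 1]
small_records, small_labels = undersample_majority(records, labels)
```

Undersampling discards data, so it tends to suit cases where the majority class is large enough that losing rows costs little.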

decision boundary, imbalanced class, resampled data, (13 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.30)

Deep Metric Learning Model for Imbalanced Fault Diagnosis

Gui, Xingtai, Zhang, Jiyang

arXiv.org Artificial Intelligence | Jul-14-2021

Intelligent diagnosis based on data-driven methods and deep learning has become an attractive and meaningful field in recent years. However, in practical application scenarios, the imbalance of time-series fault data is an urgent problem to be solved. This paper proposes a novel deep metric learning model that considers imbalanced fault data and a quadruplet data-pair design. Based on such data pairs, a quadruplet loss function is proposed that takes into account the inter-class distance and the intra-class data distribution. This quadruplet loss pays special attention to imbalanced sample pairs. A reasonable combination of the quadruplet loss and the softmax loss function can reduce the impact of imbalance. Experimental results on two open-source datasets show that the proposed method can effectively and robustly improve the performance of imbalanced fault diagnosis.
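
As background, a generic margin-based quadruplet loss can be sketched as follows. This is the common textbook form, not the paper's specific loss (the abstract does not give its formula); the margins and embeddings are illustrative assumptions.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def quadruplet_loss(anchor, positive, negative1, negative2,
                    margin1=1.0, margin2=0.5):
    """Generic quadruplet loss over one (anchor, positive, negative1, negative2) tuple.

    positive shares the anchor's class; negative1 and negative2 come from two
    other classes that differ from the anchor's and from each other.
    """
    d_ap = sq_dist(anchor, positive)
    d_an = sq_dist(anchor, negative1)
    d_nn = sq_dist(negative1, negative2)
    # Pull the positive closer to the anchor than the first negative is.
    term1 = max(0.0, d_ap - d_an + margin1)
    # Keep the same-class distance smaller than a distance between other classes.
    term2 = max(0.0, d_ap - d_nn + margin2)
    return term1 + term2


# Hypothetical 2-D embeddings with well-separated classes: the loss is zero.
loss = quadruplet_loss([0.0, 0.0], [0.1, 0.0], [3.0, 0.0], [0.0, 3.0])
```

In a combined objective such a term would typically be added to the softmax cross-entropy with a weighting factor; how the paper weights the two is not stated in the abstract.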

baseline method, imbalance, recall rate, (16 more...)

arXiv.org Artificial Intelligence

2107.03786

Country:

North America > United States > Tennessee (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > China > Guangxi Province > Nanning (0.04)

Genre: Research Report (0.50)

Industry: Materials > Chemicals (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

What is Data Imbalance in Machine Learning?

#artificialintelligence | Jul-2-2021, 07:16:18 GMT

Data imbalance, or imbalanced classes, is a common problem in machine learning classification where the training dataset contains a disproportionate ratio of samples in each class. Examples of real-world scenarios that suffer from class imbalance include threat detection, medical diagnosis, and spam filtering. Class imbalance can make training effective machine learning models difficult, especially when there aren't enough samples belonging to the class of interest. In the case of fraud detection, the number of fraudulent transactions is negligible compared to the number of lawful transactions, making it difficult to train a machine learning model because the training dataset does not contain enough information about fraud.
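
One common way to compensate without changing the dataset is to weight each class inversely to its frequency. The sketch below uses the widely used "balanced" heuristic, n_samples / (n_classes * class_count); the label names and counts are invented for the fraud example.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * cnt) for c, cnt in counts.items()}


# Hypothetical transaction labels: 98 lawful, 2 fraudulent.
labels = ["lawful"] * 98 + ["fraud"] * 2
weights = balanced_class_weights(labels)
```

Under these weights each class contributes the same total weight to a weighted loss, so the rare class is no longer drowned out by the common one.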

data imbalance, imbalance, training dataset, (13 more...)

#artificialintelligence

Industry:

Law Enforcement & Public Safety > Fraud (0.58)
Health & Medicine (0.38)
Information Technology (0.37)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

A walk through imbalanced classes in machine learning through a visual cheat sheet

#artificialintelligence | Oct-9-2020, 14:35:51 GMT

There are many detailed articles explaining the problem of imbalanced training samples and how to cope with it. In this article, I summarize my understanding of the problem in a visual cheat sheet. I often find it useful, as it comes in handy whenever I have to go back to the basic definitions (or I have an interview lined up). The cheat sheet below starts with the background on why accuracy doesn't always give correct insight into your classification algorithm and then moves on to defining other meaningful performance metrics. The cheat sheet then provides an example showing how to calculate those metrics for a three-class classification problem.
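
The cheat sheet itself is not reproduced here, but the point about accuracy can be shown with a small three-class sketch. The labels and the "always predict the majority class" model are invented for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred):
    """Precision, recall and F1 for each class, treating it as one-vs-rest."""
    metrics = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[c] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Hypothetical imbalanced labels and a model that always predicts class "a".
y_true = ["a"] * 90 + ["b"] * 5 + ["c"] * 5
y_pred = ["a"] * 100
print(accuracy(y_true, y_pred))
print(per_class_metrics(y_true, y_pred))
```

The model reaches 90% accuracy while never identifying a single "b" or "c" example, which only the per-class recall makes visible.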

artificial intelligence, machine learning, visual cheat sheet, (2 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
